10 research outputs found

    Least Dependent Component Analysis Based on Mutual Information

    Get PDF
    We propose to use precise estimators of mutual information (MI) to find least dependent components in a linearly mixed signal. On the one hand this seems to lead to better blind source separation than with any other presently available algorithm. On the other hand it has the advantage, compared to other implementations of `independent' component analysis (ICA) some of which are based on crude approximations for MI, that the numerical values of the MI can be used for: (i) estimating residual dependencies between the output components; (ii) estimating the reliability of the output, by comparing the pairwise MIs with those of re-mixed components; (iii) clustering the output according to the residual interdependencies. For the MI estimator we use a recently proposed k-nearest neighbor based algorithm. For time sequences we combine this with delay embedding, in order to take into account non-trivial time correlations. After several tests with artificial data, we apply the resulting MILCA (Mutual Information based Least dependent Component Analysis) algorithm to a real-world dataset, the ECG of a pregnant woman. The software implementation of the MILCA algorithm is freely available at http://www.fz-juelich.de/nic/cs/softwareComment: 18 pages, 20 figures, Phys. Rev. E (in press

    Estimating Mutual Information

    Get PDF
    We present two classes of improved estimators for mutual information M(X,Y)M(X,Y), from samples of random points distributed according to some joint probability density μ(x,y)\mu(x,y). In contrast to conventional estimators based on binnings, they are based on entropy estimates from kk-nearest neighbour distances. This means that they are data efficient (with k=1k=1 we resolve structures down to the smallest possible scales), adaptive (the resolution is higher where data are more numerous), and have minimal bias. Indeed, the bias of the underlying entropy estimates is mainly due to non-uniformity of the density at the smallest resolved scale, giving typically systematic errors which scale as functions of k/Nk/N for NN points. Numerically, we find that both families become {\it exact} for independent distributions, i.e. the estimator M^(X,Y)\hat M(X,Y) vanishes (up to statistical fluctuations) if μ(x,y)=μ(x)μ(y)\mu(x,y) = \mu(x) \mu(y). This holds for all tested marginal distributions and for all dimensions of xx and yy. In addition, we give estimators for redundancies between more than 2 random variables. We compare our algorithms in detail with existing algorithms. Finally, we demonstrate the usefulness of our estimators for assessing the actual independence of components obtained from independent component analysis (ICA), for improving ICA, and for estimating the reliability of blind source separation.Comment: 16 pages, including 18 figure

    Bivariate surrogate techniques: necessity, strengths, and caveats

    No full text
    The concept of surrogates allows testing results from time series analysis against specified null hypotheses. In application to bivariate model dynamics we here compare different types of surrogates, each designed to test against a different null hypothesis, e.g., an underlying bivariate linear stochastic process. Two measures that aim at a characterization of interdependence between nonlinear deterministic dynamics were used as discriminating statistics. We analyze eight different stochastic and deterministic models not only to demonstrate the power of the surrogates, but also to reveal some pitfalls and limitations.T.K. and F.M. acknowledge support from the Deutsche Forschungsgemeinschaft

    Measure profile surrogates: a method to validate the performance of epileptic seizure prediction algorithms

    Get PDF
    In a growing number of publications it is claimed that epileptic seizures can be predicted by analyzing the electroencephalogram (EEG) with different characterizing measures. However, many of these studies suffer from a severe lack of statistical validation. Only rarely are results passed to a statistical test and verified against some null hypothesis H 0 in order to quantify their significance. In this paper we propose a method to statistically validate the performance of measures used to predict epileptic seizures. From measure profiles rendered by applying a moving-window technique to the electroencephalogram we first generate an ensemble of surrogates by a constrained randomization using simulated annealing. Subsequently the seizure prediction algorithm is applied to the original measure profile and to the surrogates. If detectable changes before seizure onset exist, highest performance values should be obtained for the original measure profiles and the null hypothesis. “The measure is not suited for seizure prediction” can be rejected. We demonstrate our method by applying two measures of synchronization to a quasicontinuous EEG recording and by evaluating their predictive performance using a straightforward seizure prediction statistics. We would like to stress that the proposed method is rather universal and can be applied to many other prediction and detection problems.This work was supported by the Deutsche Forschungsgemeinschaft (Grant No. SFB TR3)
    corecore